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Abstract 

Automatic numerical algorithms are widely used in practice. An algorithm that 
is automatic attempts to provide an approximate solution that differs from the 
true solution by no more than a user-specified error tolerance, e. Furthermore, 
the computational effort required is typically determined adaptively by the algo- 
rithm based on function data, e.g., function values. Ideally, the computational 
cost should match the difficulty of the problem. Unfortunately, most automatic 
algorithms lack rigorous guarantees, i.e., sufficient conditions on the input func- 
tion that ensure the success of the algorithm. 

This article establishes a framework for automatic, adaptive algorithms that 
do have rigorous guarantees. Sufficient conditions for success and upper bounds 
on the computational cost are provided in Theorems Q] and [2] Lower bounds on 
the complexity of the problem are given in Theorem U and conditions are given 
under which the proposed algorithms attain those lower bounds in Corollary [TJ 
These general theorems are illustrated with automatic algorithms for univariate 
numerical integration and function recovery. Both algorithms use linear splines 
to approximate the input function. 

The key idea behind these automatic algorithms is that the error analysis 
should be done for cones of input functions rather than balls. The existing 
literature contains certain negative results about the usefulness and reliability 
of automatic algorithms. The theory presented does not share the assumptions 
on which those negative results are based, and so they are irrelevant. 

Keywords: adaptive, automatic, cones, function recovery, integration, 
quadrature 
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1. Introduction 

One would like to have automatic algorithms for numerical problems that 
are guaranteed to provide answers whose errors do not exceed the user-specified 
error tolerance. Such algorithms are generally lacking, even for fundamental 
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problems such as integration and function recovery. This article shows how to 
construct such algorithms. 

Specifically, we want algorithms with the following properties: 

Guaranteed — the errors in the answers provided must be no greater than 
the user-specified error tolerance, e. 

Automatic — the computational effort required depends on e, but does not 
require additional user input beyond a routine for providing function data. 

Adaptive — the computational effort depends on the input function, specif- 
ically, on the difficulty of the problem as measured by some semi-norm 
of the input function. However, the magnitude of this semi-norm of the 
input function must be inferred from function data, not specified a priori. 

Optimal — the computational cost of the algorithm, as e — > 0, does not ex- 
ceed a constant multiple of the computational complexity of the problem 
(the computational cost of the best possible algorithm). Furthermore, the 
computational cost of this new algorithm, which is not give the semi-norm 
of the input function a priori, does not exceed a constant multiple of the 
cost of the best possible algorithm that does knows the semi-norm of the 
input function a priori. 

Tunable — parameters defining the algorithm can be adjusted to change the 
algorithm's robustness, the maximum allowable computational cost bud- 
get, etc. These parameters are intended to be changed only occasionally, 
as opposed to inputs, which may change each time that the algorithm is 
called. 

The checklist in Figure [T] rates some state-of-the-art algorithms and those pro- 
posed here against these five criteria. Nearly all existing algorithms fail to meet 
one or more of these criteria. Here we present a general framework for con- 
structing algorithms that satisfy all of the above criteria. We also present two 
concrete examples, namely, univariate integration via the composite trapezoidal 
rule and univariate function recovery via linear splines. 



Automatic quadrature algorithms such as MATLAB's quad and quadgk [17 1 
and those in the NAG Library [l8| work well in practice for many cases. But, 
these popular algorithms do not have rigorous guarantees of success. The Cheb- 
fun toolbox in MATLAB [f| approximates functions by Chebyshev polynomial 
expansions and uses those expansions to approximate integrals, solve differential 
equations, etc. Chebfun also works well for many cases, but has no guarantees. 
All of these algorithms are adaptive. The degree to which they are tunable 
varies. 

Most existing theory for numerical problems of interest starts with a Banach 
space, J-, of input functions defined on some set X, and having a semi-norm, \ -\jr- 
The definition of (J 7 , \ -\jr) contains assumptions about smoothness, periodicity 
or other qualities of the input functions. The mathematical problem of interest is 
defined by a solution operator S : J- — > H, where % is some other Banach space 
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Figure 1: Check list of desired features for an algorithm. 



with its norm ||-||^. For integration, H = M, and for function approximation, 
H is some superset of J 7 , for example, Coo{X). One often finds a sequence of 
algorithms, {A n } ra gN, with computational cost n, that provides an approximate 
solution whose error can be bounded as 

\\S(f)-Mf)\\n<^^, ( la ) 

where C up and p are some known constants independent of n. It is necessary 
for A n to be exact if the semi- norm of the input function vanishes, i.e., S(f) = 
A n (f) if l/ljr = 0. Error bound (|la|) allows one to compute an approximation 
that differs from the true solution by no more than e, provided that the input 
functions lie inside a ball, i.e., 

sup \\S(f)-A n (f)\\ H <e, B a = {f G F:\f\ r < a}, (lb) 



n = cost(^4„) = 



(lc) 



Furthermore, there is a simple formula for the computational cost required to 
meet the error tolerance, e, for all functions in the ball of radius a. 

Figure [2] provides a diagram of an automatic algorithm based on these ideas. 
The parameter a must be fixed initially by the user or algorithm designer and 
is typically left unchanged each time the algorithm is called. The algorithm 
then only requires as input the error tolerance and a routine for evaluating the 
function at any point desired. This algorithm is guaranteed. It is optimal if 
any algorithm satisfying the error criterion above must have a cost of at least 
0((a I 'e) 1 ^). However, this algorithm is not adaptive. The computational cost 
does not depend on function data, but is determined purely by the input e and 
the parameter a. In the terminology of information-based complexity theory, 
the computational cost depends on global information but not local information 
0, P- 11-12]. 

The semi-norm \ f\jr may be thought of as the difficulty of the problem. In 
practice, it is difficult to know a priori the maximum possible value of |/|jr, so 
having to determine a in advance is a practical drawback of the algorithm in 
Figure [2j Moreover, since the computational cost of this algorithm depends on 
cr, not |/|jr, the cost does not decrease if \f\jr is drastically smaller than a. 
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Figure 2: Diagram of a guaranteed, non-adaptive algorithm. 



Adaptive algorithms use function data to try to bound \ f\j? or to estimate 
\\S(f) — A n (f)\\ n , so that the computational cost can be adjusted accordingly. 
Unfortunately, the bounds on \f\jr and the estimates of — A n (f)\\ H are 

typically heuristic or valid only in the limit of n — > oo. Thus, these approaches 
are not guaranteed to provide the answer to within the specified error tolerance 
for finite n. Here we show how to construct a rigorous upper bound on \f\jr 
from data, which then leads to a guaranteed automatic and adaptive algorithm. 

The key idea is to consider functions lying in a cone instead of a ball. A 
cone is a set with the property that any positive multiple of an element in the 
set is also in the set. Let Q be some superspace of J- with a weaker semi-norm, 
and define the cone 

C T = {fe T: |/U-<r|/| s }. (2) 

Functions inside the cone may have large semi-norms, but their C/-semi-norm 
can be estimated with reasonable effort. Our approach to obtaining a reliable 
approximation to the true solution, S(f), is sketched here: 

• Since the tj-semi-norm is weaker than the .F-semi-norm, it can be approx- 
imated by some algorithm G n (f), with |/L — G(f) < h+(n) \ f\jr for some 
known h + (n) that vanishes as n — > oo. One chooses a sample size uq such 
that h + (no) < 1/r. 

• The definition of the cone in {5} implies that 

\f\ g < G nG (f) + h+(n G ) \fy < G na (f)+rh + (n G ) \f\ Q , 
and so <G nB (f)/(l-rh + (n G )). 

• Applying the cone condition again leads to an upper bound on the J-- 
semi-norm, namely, \ f\j- < rG„ G (/)/(l - Th + {n G )). 

• This data-driven upper bound on \f\jr can be used with (fTc]) to guarantee 
that the error tolerance will be met using A n with 



n = cost(j4„) < 



C up TG na (/) 
e(l - Th + (n G )) 
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• If, as often is the case, G nG (/) does not overestimate the C/-semi-norm of 
/, one can be assured that the above criterion will be met for 



cost(^4„) < 



Cu P r\f\ g ) 
e(l - rh + (n G )) 



l/p 



Given that \f\jr = r |/U for some /, the cost of this algorithm as e — > 
is asymptotically the same as if one did know \f\-p in advance. 

• On top of the above analysis, we allow for a maximum computational cost 
budget, iV max , that allows the user to specify how long he or she is willing 
to wait for an answer. 

Figure [3] is a diagram of the automatic algorithm just outlined. 
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Output 

A(f) — approximation 
\\S(f)-A(f)\\ n <e 



Parameters 
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\f\r<T\f\g 

N max = cost 
budget 



Figure 3: Diagram of a guaranteed, adaptive, automatic algorithm. 

Section [5] presents a concrete example of this approach for the problem of 
computing the integral J* /(x) dx for all integrands whose second derivatives 
are absolutely integrable, i.e., |/|^ = H/"^. The trapezoidal rule based on n 
function values, i.e., n—1 trapezoids, computes an approximation with error no 

1 function values. However, except for 
i is unknown in practice. 



greater than e using [yj|7"lli /(8e 
relatively simple integrands, a bound on \\f" 

Following the procedure outlined above we choose the weaker semi-norm as 
\f\g = || /'|| 1 . This semi- norm may be estimated by an algorithm G n that takes 
the one-norm of the derivative of the piecewise linear spline for /. The error for 
G n {f) is bounded by ||/"|| 1 /(2n— 2). Fixing the parameter r > 1, and assuming 
that || < t || /'Hi (the cone condition), then the reliable numerical upper 
bound on <[ \f'\\ l can be used to obtain an upper bound for ||/"||i- This leads to 
a guaranteed, automatic (adaptive) algorithm for approximating the integral to 
the desired accuracy with a computational cost that is 0(-»/r ||/'||i /e), where 
||/ Hi is unknown a priori. Here r represents a minimum sample size, and 
1/t represents a length scale of for possible spikes that one wishes to integrate 
accurately. 

There are limited theoretical results providing conditions under which adap- 
tion is useful. Novak 12j shows the advantage of adaption in for some problems 
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in the average case and randomized settings. Plaskota and Wasilkowski 13, 15| 
demonstrate the advantage of adaption for integrating or approximating func- 
tions with singularities. By contrast, here we consider the deterministic setting, 
and we are not concerned with functions with singularities. 

This article starts with the general setting and then moves to two concrete 
cases. Section [2] lays out the problems to be solved. Section 12.21 defines non- 
adaptive algorithms, their costs, and the complexities of numerical problems 
for which non-adaptive algorithms are suited. The algorithm diagrammed in 
Figure [5] is essentially this kind of algorithm. Non-adaptive algorithms are 
the building blocks for adaptive algorithms. Adaptive algorithms, their costs, 
and the complexity of problems defined on cones of functions are introduced in 
Section 12.31 Their costs are defined in terms of the unknown, but estimated, 
C/-semi-norm of the input function. The problem definition here allows for a 
maximum computational cost budget to be imposed so that one never needs to 
wait an arbitrarily large time for the answer. 

Sections [3] and 0] describes the automatic algorithms in detail and provides 
proofs of their success for cones of input functions. Although the long term 
goal of this research is to construct good locally adaptive algorithms, where the 
sampling density varies according to the function data, here we present only 
globally adaptive algorithms, the sampling density is fixed, but the number 
of samples is determined adaptively. In particular, the following results are 
presented: 

• Algorithm [T] is a two stage algorithm: \f\g is bounded above once, and 
this bound is used to determine the additional computational cost, n, so 
that A n (f) achieves the desired accuracy. Algorithm [2] is a multi-stage al- 
gorithm, where the algorithm, G n , used to bound |/L, and the algorithm, 
A n , used to approximate S(f), are based on the same data. Moreover, 
these are embedded algorithms whose costs increase progressively accord- 
ing to some sequence n%, 712, — Once the number of samples is sufficiently 
large so that A ni (/) achieves the desired tolerance, the algorithm termi- 
nates. 

• Theorems [T]and [2] in Section|3]prove that Algorithms[T]and[2l respectively, 
are guaranteed to produce answers to within the desired error tolerance, 
provided that they do not exceed the maximum cost budget. These the- 
orems also demonstrate that the computational costs of these algorithms 
do not exceed the costs of algorithms that know \f\jr or |/L a priori. 

• Theorem [4] in Section [4] provides lower bounds on the computational com- 
plexity of the problems defined on cones of input functions by constructing 
fooling functions. Corollary [T] shows that Algorithms Q] and [5] are optimal 
if the non-adaptive algorithms on which they are based are optimal for 
problems defined on balls of input functions. 

Section [5] illustrates the general theorems in Sections [3] and 0] for the uni- 
variate integration problem. An explicit automatic algorithm based on the 
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trapezoidal rule is presented and its success and optimality are proved. Section 
[6] considers function approximation with analogous results. Common concerns 
about automatic and adaptive algorithms are answered in Section [7J The article 
ends with several suggestions for future work. 

2. General Problem Definition 

2.1. Problems and Algorithms 

The function approximation, integration, or other problem to be solved is 
defined by a solution operator S : J- — >• H, where J 7 is a Banach space of possible 
input functions defined on X with semi-norm \ -\jr, and % is some other Banach 
space of possible outputs or solutions with norm The solution operator is 

assumed to have a scale property, i.e., 

S(cf) 

Examples include the following: 

Integration: S(f) 

Function Recovery: S(f) 

Poisson's Equation: S(f) 

Optimization: S(f) 

The first three examples above are linear problems, which automatically have 
the scale property for S, but the last example is a nonlinear problem, which 
nevertheless also has the scale property. 

The goal is to find an algorithm A : T -> U for which S(f) A(f). Fol- 
lowing the definition of algorithms described in Section 3.2], the algorithm 
takes the form of some function of data derived from the input function: 

A{f) = </>(L(f)), L(f) = (L 1 (f ),..., L m (f)) V/eJP. (3) 

Here the Li £ A are real- valued functions defined on T with the following scale 
property: 

L(cf) = cL(f) V/eJ, eel, L e A. (4) 

One popular choice for A is the set of all function values, A std , i.e., Li(f) = f(xi) 
for some Xi G X. Another common choice is the set of all bounded linear 
functionals, A lm . In general, m may depend on / and the choice of Li may 
depend on L%{f), . . . , Lj_i(/). In this article, all algorithms are assumed to be 
deterministic. There is no random element. 



= cS(f) Vc > 0. 



/ f(x) p(x) dx, p is fixed, 
J x 

/, 

, — Au(x) = f(x), x £ X, 

7/ where v 

' u(x) = Vx G dX, and 

min/(aj). 
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2.2. Non- Adaptive Algorithms 

The set ^4 non (J r , Ti, S, A) contains algorithms as just described for which the 
choice of the Li and the number of function data used by the algorithm, m, are 
both assumed to be independent of the input function, i.e., these algorithms are 
non-adaptive. Furthermore, any A g A non (J-, T~L, S,A) is assumed to satisfy the 
following scale properties: 

L(cf)=cL(f), cb(cy)=ccb(y), A(cf) = cA(f) Vc > 0, yeR m . (5) 

The cost of a non- adaptive algorithm, A £ A n0 n(J r ,'H, S, A), is fixed and is 
defined as the sum of the costs of all the function data: 

cost (A) = = $(Lx) + ■■■ + $(L m ), (6) 

where $ : A — > [l,oo), and $(L) is the cost of acquiring the datum L(f). The 
cost of L may be the same for all L € A, c.g, $(L) = 1. Alternatively, the cost 
might vary with the choice of L. For example, if / is a function of the infinite 
sequence of real numbers, (xi,X2, ■ . .), the cost of evaluating the function with 
arbitrary values of the first d coordinates, L(f) = f{x \ , . . . , Xd, 0, 0, . . .), might 
be d. This cost model has been used by 0,0, El Hi 13] for integration problems 
and [HI, HH, [23| for function approximation problems. 

The error of a non-adaptive algorithm A € ^ non (J r , H, S, A) is defined as 

etx{A,F,H,S) = min{5 > : \\S(f) - A(f)\\ H < S |/|^ V/ e J"}> (7) 

When the problem has real- valued solutions, i.e., H — K, one may also define a 
one sided error criterion: 

eir ± (A,F,R,S)=min{6>0:±[S(f)-A{f)]<6\f\f V/ e J 7 }. (8) 

A finite error in either of the above definitions assumes that the algorithm is 
exact, i.e., S(f) = A(f), for all / with \ f\jr = 0. 

The above error criteria are normalized, meaning that the absolute error, 
— ^4(/)|| n is measured with respect to the ^-semi-norm of the input func- 
tion. The complexity of a problem for this set of algorithms, «4 n on(-?""j %i A), 
is defined as the cost of the cheapest algorithm that satisfies the specified error 
tolerance, e: 

comp(e, Ahm^J 7 , "H, 5*, A)) 

= inf {cost(A) : err(A, F,H,S)<e, A e ^ non (J', H, 5, A)} . (9) 

Here the infimum of an empty set is defined to be oo. So, to guarantee that 
\\S(f) — A(f)\\ n < e, one needs an algorithm with a cost of at least 

comp(e/ \f\r, A non (T, H,S,A)). 

This cost is non-increasing as either e decreases or \ f\jr increases. 
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Suppose that there is a sequence of nonadaptive algorithms indexed by their 
cost, and which converge to the true answer: 

{A n } neX , A n £ AaoaiJ 7 , %, S, A), (10a) 
lim err(A„, T, H, S) = 0, cost(A„) = n, (10b) 

where the countable, non-negative- valued index set, 

X = {m, ri2, . . .} with < rij+i, satishes sup 1+1 < oo. (11) 

This sequence of algorithms is called optimal for the problem (J 7 , H,S, A) if it 
essentially tracks the minimum cost algorithms, namely, 

min{n £ I : err(A n , J", %,S) < e] , . 

sup ; j ._ „, - . < oo. (12) 

o< e <i comp(e, A n0 n{F,rL, S, A)) 

2,5. Automatic, Adaptive Algorithms 

Non-adaptive algorithms, A £ yt non (J, "H, 5*, A) need an upper bound on 
\f\jr to guarantee that they meet the prescribed error tolerance for the input 
function /. Adaptive algorithms attempt to estimate \ f\jr and then determine 
the number of function data needed to meet the error tolerance. Such automatic, 
adaptive algorithms are now defined somewhat differently from the non-adaptive 
algorithms above. However, in practice automatic algorithms use non-adaptive 
algorithms as building blocks. 

Practical automatic algorithms in A(F, H,S, A) take the form of ordered 
pairs of functions 

(A, W) : T x (0, oo) x [1, oo] U x {false, true}, 

for which one hopes that S(f) ~ A(f;e,N niax ). Here e £ (0, oo) is a user- 
supplied error tolerance, N ma _ x £ [1, oo] is a user-supplied maximum cost bud- 
get, and W(f; e, N max ) is a Boolean warning flag that is false if the algorithm 
completed its calculations without attempting to exceed the cost budget, and 
is true otherwise. 

As in ([3]), the algorithm takes the form of some function of function data: 
A(f; e, N mllx ) — <j> (L(f); e, AT max ). Now, however, the algorithm is allowed to 
be adaptive. The choice of may depend on the value of L±(f), the choice of 
L3 may depend on L±(f) and Z/2(/), etc. The number of function data used by 
the algorithm, m, may also be determined adaptively. The choice of how many 
and which function data to use depends on e and N ma ^. Thus, L(cy) might 
not equal cL(y) since the length of the information vector depends on the data 
recorded. The goal of the algorithm is to make HS^/) — A(f; e, N mllx )\\ n < e, 
but this is not a requirement of the definition. 

The cost of the algorithm for a specified input function is defined analogously 
to ^ as the sum of the costs of all function data. 

cost(A /; e, AW) = S(L) = $(ii) + • • • + %{L m ). 
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Because of the potentially adaptive nature of the algorithm, namely that m may 
depend on /, it follows that the cost may depend on / as well as A. The input 
parameter A max tells the algorithm to ensure that cost(A, /; e, N max ) < A max 
for all / and s. This is a practical consideration since the user does not want to 
wait indefinitely for an answer. 

The cost of the algorithm is expected to scale with some (/-semi-norm of 
the integrand, where the (/-semi-norm is weaker than the ^-semi-norm. In 
particular, the cost of an algorithm generally increases as \f\g increases. The 
cost of the algorithm may also depend on AT, the subset of T where the functions 
of interest lie. To represent this idea one defines the following cost: 

cost(A,Af,e,N max ,a) = sup{cost(A, /; e, N max ) :/eJV, \f\ g < a}. 

Here the set Af is allowed to depend on the algorithm inputs, e and A max , but 
not on a. Moreover, the algorithm, A, does not take a as an input. 

For automatic algorithms, returning an approximation with the desired error 
is not enough. One also wants the algorithm to be confident that the answer is 
correct. An algorithm (A, W) £ A(Af, W, S, A), is deemed successful provided 
that it meets the prescribed error tolerance and does not raise the warning flag. 
Specifically, success is defined as 

succ{A,W,N,s,N max ) 

= J true if \\S(f) - A(f; S ,N max )\\ n < e & W(f;e, N max ) = false V/ £ Af, 
1 false otherwise. 

The above are absolute error criteria for success. One might also define relative 
error criteria instead, but finding successful algorithms for relative error is a 
non-trivial exercise and will be considered in future work. 

The complexity of a problem is defined as the cost of the cheapest successful 
algorithm for input functions with C/-semi-norm no greater than a: 

comp(e, A(J\f,H, 5, A), iV max , a) 

= inf {cost(A, A/", e, N m3X , a) : succ(A, W 7 A/", e, A^ max ) = true, 

(A, W) £ A(Af, H, S, A)} . (13) 

Here the infimum of an empty set is defined to be oo. The optimality of adaptive 
algorithms is defined analogously to (fT2|) . 

The set of non-adaptive algorithms, ^4 n on(-^ r , S, A), defined in the previous 
subsection is a subset of the automatic algorithms -4(.F, H, S, A). Algorithms 
in ^4 non (J r , H, S, A) are not affected by the error tolerance e and do not rec- 
ognize a cost budget A^ max . Moreover, the warning flag for an algorithm in 
A non (J 7 , H, S, A) is always returned as false. Whereas the non-adaptive algo- 
rithms are inherently impractical by themselves, they are vital components of 
automatic, adaptive algorithms. 



10 



2.4- Cones of Functions 

All algorithms can be fooled by some input functions, even if these functions 
are sufficiently smooth. An algorithm and error analysis such as that in (TTJ) 
rules out fooling functions with large error by restricting the size of \f\jr- 

It is often difficult to know how large \f\jr is a priori and so practical au- 
tomatic algorithms try to bound it. The framework described here rules out 
fooling functions whose ^-semi-norms cannot be reliably bounded above. This 
is done by considering Q, a superspace of with its own semi-norm |-L. The 
semi- norm |-L is considered to be weaker than in the following sense: 

min \f-f \ g <Cr\f\r V/eJ, (14a) 

where Fa = {/ G J- : \f\jr = 0} is a finite dimensional subspace of J 7 . Moreover, 
it is assumed that any / G Q with zero C/-semi-norm must also lie in T and have 
zero .F-semi-norm: 

Go C Jo. (14b) 

Given t > 0, let C T C T denote a cone of functions whose .F-semi-norms are 
no greater than t times their C/-semi-norms, as defined in @. For any / e C T 
both \f\jr and \f\g must be finite, but they can be arbitrarily large. There is 
no need to assume an upper bound on their sizes, but it is possible to obtain 
reliable upper bounds for both \f\jr and \f\g by sampling /. This argument 
appears to be circular, but is in fact valid. An upper bound on \f\g for / 6 Cl- 
ean be computed in terms of the data L(f) = (Li(f), . . . ,L m (f)) because \-\jr 
is a stronger semi-norm than |-L in the sense of (|14a[) . and because \f\jr is no 
larger than a multiple of \f\g (see Lemma [1] below). This upper bound on \f\g 
then automatically implies an upper bound on \f\jr from the definition of the 
cone. These reliable bounds on both 1/1^ and \f\g may be used to obtain a 
bound on the error of the algorithm for estimating S(f) (see Theorem [1] below). 

2.5. Results that We Prove 

The previous subsections define the problem to be approximated and the 
notation describing the difficulty of the problem and the efficiency of the algo- 
rithms. This subsection summarizes the results that are proved in general in 
the next two sections and illustrated for specific cases in the Sections [5] and [51 

i. Upper bound on the complexity. Theorems [TJ [21 [SJ and [7] provide upper 
bounds on the complexity of solving the problem, comp(A/", e, A^ max , a, A), 
in terms of some function of e/a and for the set Af C C T . 

ii. An algorithm that achieves the upper bound. Algorithms [T] and [2] provide 
explicit, constructive frameworks for two kinds of successful algorithms, 
(A, W) € A{N^'H 1 S 1 h), that achieve these upper bounds. Algorithms [3] 
and 2] provide concrete cases. 

iii. Penalty for not knowing the T- and Q -semi-norms of f. The optimal 
successful algorithm must find an upper bound on \f\jr or \f\g rather than 
assuming such an upper bound. One hopes that the extra cost relative to 
the situation of knowing a priori bounds on these semi-norms is not too 
great. Positive results are given in Theorems [U [21 El and [7] 
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iv. Lower bound on the complexity and optimality of algorithms. The difficulty 
of the problem is provided by lower bounds on the complexity. These 
are given in Theorem 01 Conditions under which Algorithms Q] and [2] 
attain these lower bounds are given in Corollary [TJ Specific examples for 
univariate integration and function recovery are given in Theorems [5] and 

m 



3. General Algorithms and Upper Bounds on the Complexity 

This section provides rather general theorems about the complexity of auto- 
matic algorithms. In some sense, these theorems are a road map or an outline 
because their assumptions are non-trivial and require effort to verify for specific 
problems of interest. On the other hand, the assumptions are reasonable as is 
demonstrated in the Sections [5] and |5] where concrete cases are are discussed. 

3.1. Bounding the Q -Semi-Norm 

As mentioned in Section 12.41 automatic algorithms require reliable upper 
bounds on |/L for all / in the cone C T . These can be obtained using any non- 
adaptive algorithm G n € Aion(-7-\ R+, He , A) for approximating |/L for / e T , 
provided that one has explicit upper bounds on the errors of these algorithms 
as in ©. Here, n = cost(G„). Namely, there should be non-negative valued, 
non-increasing functions h± defined on some subset of the non-negative numbers 
such that 

err ± (G n ,jr,M + ,|.| s ) 

= mm{5 > : ±[\f\ g - G n (f)} < S |/|^ V/ £ J} < h±(n). (15) 

Noting that \ f\jr < T |/U for all / in the cone C T and rearranging the terms in 
the inequality defining the error for G n implies the lemma below. 

Lemma 1. Any nonadaptive algorithm G n G ^4non(-7 r , R+, |-|g , A) with cost 
n = cost(G„) and two sided error bounds as in (I15p yields an approximation to 
the Q -semi-norm of functions in the cone C T with the following upper and lower 
error bounds: 

< ^-<\f\ g <£nG n (f) V/eC T , (16) 

where the deflation factor, l/c„, and the inflation factor, £„ are defined as 
follows: 



1 



1 — rh + (n) 



e n = l + Th-(n)>l, (17) 
> 1, assuming h + (n) < 1/r. (18) 
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3.2. Two-Stage Automatic Algorithms 

Computing an approximate solution to the problem S : C T — » H, e.g., 
integration or function approximation, depends on non-adaptive algorithms. 
Suppose that there is a sequence of such of algorithms, {A n } ne x, with A n £ 
A W Ti{G,'H,S,h) 1 indexed by their cost as defined in (fT0|) . and for which upper 
error bounds are known for both the spaces Q and T: 

eir(A n ,g, U, S) < h(n), eir(A n , JF, U, S) < h{n), (19) 

for some non-increasing functions h and h. The definitions of these errors in (J7J 
then implies upper bounds on the error of A n (f) in terms of the ^/-semi-norm 
of/: 

\\S(J) - A n {f)\\ H < mm(erv(A n ,g,'H,S) \f\ g ,evr(A n ,T,H,S) \f\ F ) 

<min(h(n),Th{n))\f\ g V/eC r . (20) 

Algorithm 1. (Automatic, Adaptive, Two-Stage). Let J 7 , Q, and H 

be Banach spaces as described above, let S be the solution operator, and let 
Mnax be the maximum cost allowed. Let r be a fixed positive number, and let 
G„ G £ ^l non (J r ,M + , \ -\g , A) be an algorithm as described in Lemma Q] with cost 
n G satisfying h + (n G ) < 1/r. Moreover, let {A n } ne z, A„ £ A n0 a(G,H, S, A), be 
a sequence of algorithms as described in (fT0|) and ([T9| . Given a positive error 
tolerance, e, and an input function /, do the following: 

Stage 1. Estimate \f\g. First compute G nG (f). Define the inflation factor 
€ = € nG according to (fT%)) . Then €G na (f) provides a reliable upper 
bound on \f\g- 

Stage 2. Estimate S(f). Choose the sample size need to approximate S(f), 
namely, n A = N A (e/(<£G na (f))), where 

Na(q>) = min (n £ X : min(ft,(n), rh(n)) < a X , «€ (0, oo). (21) 

If ua < N max — uq, then S(f) may be approximated within the desired 
error tolerance and within the cost budget. Set the warning flag, W, 
to false. Otherwise, recompute ua to be within budget, ua = N max :— 
max{fi £ I : n < iV max — no}, and set the warning flag, W to true. 
Finally, compute A nA (f) as the approximation to S(f). 

Return the result (A nA (f), W), at a total cost of ng + ha- 

Theorem 1. Let J- , Q, %, t, N max , N maX! £, and e be given as described in 
Algorithm^ and assume that IF satisfies (jT4j) . Let c = c„ G be defined as in 
(|17[) . Let C T be the cone of functions defined in ^) whose T -semi-norms are no 
larger than r times their Q -semi-norms. Let 

= \feC T : |/U < « 1 (22) 

I Ccmin(/i(iV max ),T/i(iV max )) I 
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be a subset of the cone C T that lies inside a Q-semi-norm ball. Since iV max is 
assumed to be very large, it follows that whx(h(N max ), r/i(-/V max )) is tiny, and 
the radius of the Q-semi-norm ball defining Af is quite large. 

Then it follows that Algorithm]]] is successful for all functions in this set of 
nice functions N , i.e., succ(^4, W, N, e, iV max ) = true. Moreover, the cost of this 
algorithm is bounded above in terms of the unknown Q-semi-norm of the input 
function as follows: 



cost(A,Af,e,N max ,\f\ g ) <n G + N A [^^- J ■ (23) 

Consider the limit of infinite cost budget, i.e., iV max — > oo. // the sequence 
of algorithms {A n } n£ x, A n € A. n0 n(Q, S, A) is optimal for the problems 
(Q S, A) and (J 7 , "H, S, A) as defined in (|12[) . then Algorithm[l\ does not incur 
a significant penalty for not knowing \ f\g a priori, i.e., 

COSt(A,C r ,e,OQ,cr) 

sup -7—. — ( < oo, Je{T,Q}. (24) 

o<e/v<i comp(e /a, A non (J ,H, S, A)) 



Proof. The definition of £ in (fT8|) implies that the true C/-semi-norm of / is 
bounded above by €G nG (/) according to Lemma [TJ The upper bound on the 
error of the sequence of algorithms {A n } ne x in ([20)) then implies that 

\\S(f) - A n (f)\\ H < min(h(n),Th(n))€G na (f) V/ e C T . 

This error upper bound may be made no greater than the error tolerance, e, by 
choosing the algorithm cost, n, to satisfy the condition in Stage 2 of Algorithm 
[U provided that this can be done within the maximum cost budget. In this 
case, the algorithm is successful, as claimed in the theorem. 

To ensure that the algorithm does not attempt to overrun the cost budget, 
one must limit the ^-semi-norm of the input function. The definition of c in 
(fl7|) implies that G na (f) < c\f\g according to Lemma Q] This means that for 
any function, /, with actual ^-semi-norm |/U, the upper bound on its C/-semi- 
norm computed via Lemma [1] is no greater than £c|/U. Thus, after using Ng 
samples to estimate |/L, functions in AT as defined in (|2"21 never need more 
than -/V max — no additional samples to estimate S(f) with the desired accuracy. 
This establishes that Algorithm [T] must be successful for all / S Af, and also 
establishes an upper bound on the computational cost as given in (|23[) . 

Now consider the penalty for not knowing \f\g a priori. If the sequence of 
nonadaptive algorithms, {A n } n ^x, used to construct Algorithm [T] are optimal 
for solving the problem on both T and Q, as defined in (fl"2"|) . then it follows that 
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cost(A,Af, e, oo, a) 

0<e/a<l COmp(e/(T, AnoniJy'H, S, A)) 

cost (A, A/", e, oo, <j) 
o<e/V<i min{n £ I : crr(A n , J, 5) < e/a} 

mm{n £ 1 : eiv(A ni J, H, S) < e/a} 

0<e/a<l C0mp(e/CT, A non (J, H, S, A)) 

The first of these suprema is finite by comparing the convergence rates of the 
sequence algorithms, {A n } ne z, in (fT9|) with the cost of the automatic algo- 
rithm given by ([25)1 . The second of these suprema is finite by the optimality of 
{A n } nex . □ 

There are a several remarks that may facilitate understanding of this result. 
Remark 1. Three main conditions must be checked for this theorem to hold. 

i. There must be an algorithm, G„, as described in Section [3. II that approx- 
imates the weaker semi-norm, |-L, and its error bound, h±, as defined in 
(IT5|) must be known explicitly. 

ii. There must be a sequence of nonadaptive algorithms, {A„}„ e i, as de- 
scribed at the beginning of Section l3~2| and the error functions h and h, 
defined in (fH?)) must be known explicitly. 

iii. Condition (IT21) defining the optimality of {A n } ne x must be satisfied to 
ensure that there is no significant penalty for not having an a priori upper 
bound on |/U. 

Sections [S] and [B] provide concrete examples where these conditions are checked. 

Remark 2. If h is unknown, then one may take h(n) — oo, and the algorithm 
still satisfies the error tolerance with a cost upper bound given in (|23|) . The 
optimality result in (|24|) then only holds for T, and not Q. The analogy holds 
if h is unknown. However, at least one of these two functions h or h, must be 
known for this theorem to be meaningful. 

Remark 3. The cost of Algorithm [1] as given by ([23| . depends on the Q-semi- 
norm of the input function, /. However, |/L is not an input parameter for the 
algorithm, but rather is conservatively estimated by the algorithm. The number 
of samples used to obtain the approximation to S(f) is adjusted accordingly 
based on the estimate of \ fig- 
Remark 4. The definition of the set of algorithms for which the Algorithm Q] 
is guaranteed to work, A/", depends somewhat on \f\g, but only because of the 
practical constraint of a cost budget of N max . This dependence disappears if one 
lifts this constraint by taking N maK — > oo. The primary constraint determining 
the success of the algorithm is that / lies in the cone C T . 
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Remark 5. Instead of choosing r as an input parameter for Algorithm [TJ one 
may alternatively choose the inflation factor £ > 1. This then implies that 



'-H)s^sr (25) 

which is equivalent to (fl8l) . 

Remark 6. It is observed in the examples of Sections [5] and [5] that for the 
sequence of algorithms {A n } n£ x 

Th(n) < h(n) Vn G X. (26) 

or equivalently, min(/i(7i), r/i(n)) = rh(n). This then simplifies Algorithm [T] in 
the computation of the sample size for A in (|2Tj) and also in Theorem [1] in the 
definition of Af in (j22|) and the upper bound on the cost in (f23|) . 
A sufficient condition for (|26l) is 

fc(n) < h(n)h+{n) Vn € I, (27) 

since the Algorithms Q] and [2] both require that h+(n) < 1/r, and since /i is a 
non-increasing function. Another sufficient condition for (|26j) is finding some 
sequence of functions {/* G Jjnei with non-zero J 7 - and C/-semi-norms such 
that 

K n ) < it;] 177| Vn G I, (28) 

\Jn\g \Jn\J r 

since the right hand side above is no greater than the right hand side in (|2"?| . An 
advantage of (|2"5|) is that it does not require h to be known explicitly. Finally, 
suppose that one can construct a sequence of functions {/* G J-} n ex with a ^ 
vanishing data for both the algorithms A n and G n that attains the upper bound 
on the error of A n with respect to the ^-semi-norm, i.e., 

h{n)= PUmn vnel (29) 

Then this sequence automatically satisfies (|28p. which then implies (f26l) . 

5.5. Automatic Algorithms Based on Embedded Algorithms {A n } ne x 
Suppose that the sequence of nonadaptive algorithms, 

L-^-n/nGX {^ni : A n2 

are embedded, i.e., A ni+1 uses all of the data used by A ni for i — 1,2,.... 
An example would be a sequence of composite trapezoidal rules for integration 
where the number of trapezoids is a power of two. Furthermore, it is supposed 
that the data used by G na , the algorithm used to estimate the C/-semi-norm 
of /, is the same data used by A ni , and so n\ = no- Then the total cost of 
Algorithm [T] can be reduced; it is simply ha, as given in Stage 2, instead of 
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tiq + riA- Moreover, N max may then be taken to be AT max , and the cost bound 
of the automatic algorithm in (1231) does not need the term tiq. 

Again suppose that {A n } ne x, A n G „4 non (C/, "H, 5, A), consists of algorithms 
as described in (fTU|) and (TLT)1) . but some of which are embedded in others. An 
example would be all possible composite trapezoidal rules for integration that 
use trapezoids of equal widths. Moreover, suppose that there exists some fixed 
r > 1 such that for all n G X, there exists a n G X with n < h < rn, such 
that the data for A n is embedded in the data for A„. One may think of r 
as the minimum cost multiple that one must incur when moving to the next 
more costly algorithm. For trapezoidal rules, where the cost is the number of 
trapezoids plus one, one may take h — 1 = 2{n — 1), so one may choose r = 2. 

Suppose also that there exists a sequence of algorithms for approximating the 
^-semi-norm, {G„} nE z> G n G A n0 n{G, R+, Ho , A), such that for each n EX, A n 
and G n use exactly the same data. Since h± are non-increasing functions, the 
quantities £ n and c ra do not increase as n increases. These embedded algorithms 
suggest the following iterative algorithm. 

Algorithm 2. Let the Banach spaces J 7 , Q, and H, the solution operator S, 
the maximum cost budget A max , and the positive constant r be as described 
in Algorithm [TJ Let the sequences of algorithms, {A n } n& x and {G„}„ e x be as 
described above. Set i = 1, and n% = min{n G T : h + {n) < 1/t}. For any 
positive error tolerance e and any input function /, do the following: 

Stage 1. Estimate ||/||g. Compute G ni (f) and € ni < oo as defined in (TT5|) . 

Stage 2. Check for Convergence. Check whether rij is large enough to sat- 
isfy the error tolerance, i.e., 

min(h(ni),Th(n i ))<U i G ni {f) < e. (30) 

If this is true, then set W to be false, return (A ni (f),W) and terminate 
the algorithm. 

Stage 3. Compute rii+i. Otherwise, if (f30]) fails to hold, compute t Ui accord- 
ing to (fT7|) using G„ 4 . Choose n.j+i as the smallest number exceeding 
and not less than NA(sc n JG ni (f)) such that A nj is embedded in A ni+1 . 
If rii + i < A^ max , increment i by 1, and return to Stage 1. 

Otherwise, if n^+i > A^ax, choose n^+i to be the largest number not 
exceeding A max such that A ni is embedded in A ni+1 , and set W to be 
true. Return (A ni+1 (f), W) and terminate the algorithm. 

This iterative algorithm is guaranteed to converge also, and its cost can be 
bounded. The following theorem is analogous to Theorem Q] 

Theorem 2. LetJ-, Q , %, N max , t, m, ande be given as described in Algorithm 
[H Assume that n\ < A max . Let r be the minimum cost multiple described in 
the paragraphs preceding that Algorithm^ . Define 

Na{o) = min < n G X : min(ft,(n), Th(n))£ n c n < a > , a € (0, oo). (31) 
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Let C T be the cone of functions defined in ^ whose J- -semi-norms are no larger 
than t times their Q-semi-norms. Let 

= If e C T : |/L < J- L (32) 

{ £Ar max /rCAr max / r min(/i(iV max /r),T/i(A^ max /r)) J 

be the nice subset of the cone C T . Then it follows that Algorithm^ is successful 
for all functions in N , i.e., succ(A,W,Af,e, N max ) = true. Moreover, the cost 
of this algorithm is bounded above in terms of the Q -semi-norm of the input 
function as follows: 

cost(A,M,e,N max ,\f\ g ) < max \rn,rN A • ( 33 ) 

Consider the limit of infinite cost budget, i.e., -/V max — > oo. If the sequence 
of algorithms {A n } ne x, A n £ A n0 n(G, H, S, A) is optimal for the problems 
{Q,%,S,K) and (J 7 , H,S,A) as defined in (|T2l , then Algorithm^ does not incur 
a significant penalty for not knowing |/L a priori, i.e., 

cost(A,C T ,e,oo,a) 

SU P 1—1 A 1 rr n, n \ \ \ < °° I J ^ {F , Q\- 

o<£/<r<i comp(£/(T,^ 

Proof. Let m, . . . ,rij be the sequence of ni generated by Algorithm [21 j being 
the number of the iterate where the algorithm either 

i) terminates successfully because the convergence criterion, (f30)) . is satisfied 
for i = j, or 

ii) terminates with a warning because (|30p is not satisfied for i = j, but the 
the proposed n^+i exceeds the cost budget, N max . 

Here j may be any positive integer. The design of Algorithm [2] guarantees that 
ni < ■ ■ ■ < rij. It is shown that under the hypotheses of this theorem, that 
under condition i), the error tolerance is met and that the cost of the algorithm 
satisfies upper bound (I33|) . It is also shown that condition ii), an unsuccessful 
termination of the algorithm, in particular, / £ Af, is impossible. 

First, consider possibility i) and recall inequality (|16p . Since (|30j) is satisfied, 
it follows that ||<S(/) — A nj (/)||^ < e by the same argument as given in the 
proof of Theorem [TJ In this case the algorithm terminates without warning and 
the approximate value is within the required tolerance. 

If j = 1, then the cost of the algorithm is n\. If j > 1, then the conver- 
gence criterion was not satisfied for i = j — 1. Thus, it follows that Uj-x < 
^a( £ / l/lg)> snice h{n),h(n),€ n , and c„ all do not decrease as n increases. If 
n j-i > NA(£C nj _ 1 /G n -_ 1 (f)), then Stage 3 chooses nj to be the smallest ele- 
ment of X that exceeds nj—i and for which A n ._ x is embedded in A n .. By the 
definition of r it follows that 

nj < rrtj-i < rN A (e/\f\ g ), 
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and the upper bound on the cost in (j3"3")l is satisfied. If, on the other hand, 
rij_i < NA(ECn j _ 1 /G nj _ 1 (f)), then Stage 3 chooses rij to be the smallest el- 
ement of 1 that is no less than NA(sc nj _ 1 /G nj _ 1 (f)) and for which A 7lj _ 1 is 
embedded in A n , . By the definition of r, p^|) . and the definition of Na, it 
follows that 

n 3 < rNAie^JGn^U)) < rN A (e/ \f\ g ) < rN A (s/ \f\ g ). 

Again, the cost of the algorithm is bounded as in (|33l) . 

Second, consider possibility ii), meaning that the convergence criterion, (|30)) . 
is not satisfied for i = j. Then Stage 3 tries to choose rij+i to satisfy this 
criterion. Using similar arguments as in the previous paragraph, it follows that 
rij < Na{e/ \f\g)- If n,j > N A (ec nj /G nj (f)), then the proposed rij+i satisfies 

n j+ i <rn < rN A (e/ \f\ g ). 

If, on the other hand, rij < N A {£C nj /G nj (f)), then the proposed n^+i satisfies 

n j+1 < rN A (sC nj /G nj (f)) < rN A (e/\f\ g ) < rN A (e/\f\ g ). 

In both cases, the rij+i proposed by the algorithm satisfies Uj+i < rN A (e/ |/U), 
which does not exceed the cost budget by the by the definition of N . Thus, 
possibility ii) cannot happen. 

The proof of the optimality of the multistage algorithm follows the same line 
of argument used to prove the optimality of the two-stage algorithm in Theorem 
[TJ This completes the proof. □ 

4. Lower Complexity Bounds for the Algorithms 

Lower complexity bounds are typically proved by constructing fooling func- 
tions. First, a lower bound is derived for the complexity of problems defined 
on T- and C/-semi-norm balls of input functions. This technique is generally 
known, see for example [2(| p. 11-12]. Then it is shown how to extend this idea 
for the cone C T . 

Consider the Banach spaces J-, G, %, and the linear solution operator S : 
Q — > %. Let A be the set of bounded linear functionals that can be used as 
data. Suppose that for any n > 1, and for all L £ A m , satisfying $(£.) < n, 
there exists an fi £ T, depending on n and the Lj, with zero data, solution 
norm one, and J- and Q semi-norms with known upper bounds, namely, 

\\S{h)\\ n = 1, L(A) = 0, \h\ g < ~g(n), \f^ < g(n) \h\ g , (34) 

for some positive, unbounded, non-decreasing functions g and g defined on 
[l,oo). For example, one might have g(n) = an p and g(n) = bn q with posi- 
tive a, b,p, and q. Since the data for the function f\ are all zero, it follows that 
A(/i) = A(— f{) for any algorithm, A, adaptive or not, that is based on the 
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information L(fi). Then, by the triangle inequality and Q34p the error for one 
of the fooling functions ±/i must be at least one: 

max(||5(A) - A(h)\\ n , ||5(-/i) - A(~h)\\ n ) 

= max(||S(A) - A{h)\\ n , \\-S(h) - A(h)\\ H ) 

>\ [115(A) -A{h)\\ H + +A(f 1 )\\ n ] 

> \ II [5(A) - A{h)\ + [5(A) + A{h)\\\ n = \\S{h)\\ H = 1- 

Furthermore, applying (|34l) . for J 6 {J 7 , £/}, it follows that any nonadaptive 
algorithm satisfying the error tolerance e must have a cost n satisfying the 
following inequality: 

e >supW>-^"« 



> 



/*> ll/IU 

max(||5(A) ~ A{h)\\ u , - A(-/0|| w ) 



WhWj 

1 



500' 



J = e, 



- WhWj ~ ) i 



illj : ; J = T. 

g(n)g(n) 7 

This implies lower bounds on the complexity of nonadaptive algorithms, as 
defined in ©: 



COmp(£, A non (J, H,S,A)) > 



srHO, ^ = £> 

(ffS)- 1 ^- 1 ), J = ^, 



where g _1 and (<?g) _1 denote the inverse functions of g and gg, respectively: 

g^ 1 (x) — inf{n > 1 : g(n) > x}, (gg) -1 ^) = inf{n > 1 : g(n)g(n) > x}. 

Thus, the cost of solving the problem within error tolerance e for input functions 
in a ^-semi-norm ball of radius cr is at least g~ 1 (ae~ 1 ) and for input functions in 
a .F-semi-norm ball of radius a is at least (sff) -1 (erE ). This leads to a simple 
check on whether a sequence of non-adaptive algorithms is optimal. 

Proposition 3. Suppose that there exist Tion- adaptive algorithms {A n } n £2; ; 
with A n G *4 n0 n(!?, H, 5, A), /or which the upper error bounds satisfy (|19[) . /or 
known h and h. If 

sup inf{j : jm € I, g(m)h(jm) < 1} < oo, (35a) 

m>l 

i/ien i/iese algorithms are optimal in the sense of (|12[) /or £/ie problem defined 
on Q. If 

sup inf-jj : jm <E I, g(m)g(m)h(jm) < 1} < oo, (35b) 

m>l 

then these algorithms are optimal for the problem defined on T . 
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Proof. For the case of input functions defined on Q it follows that 



min{n € X : err(^4„, Q,%, S) < e} 
<e<i comp(e, Ano^QjH, S, A)) 

min{n £ X : h(n) < s\ 

< sup _ 

0<e<l .9 H £ ) 

minjn e X : g(m)h(n) < 1} , , .. 

< sup fe = mm)) 

m>i m 

minf|7 : im € X, q(m)h(jm) < 1} 

< sup ^— ^ av y v — y ~ J < 00. 

m>i m 

The proof for T follows similarly. □ 

Turning to the problem of solving functions in the cone C r , the lower bound 
on the complexity becomes a bit more difficult to derive. Note that condition 
allows the fooling function f% to lie outside this cone for g{n) > r. Thus, 
when considering the cone of input functions, the fooling function must be 
constructed as a linear combination of fi and a function lying in the interior of 
the cone. 

Suppose that there exists a function fo with non-zero ^-semi-norm lying in 
the interior of the cone C T , i.e., 

|/o| c >0, |/oU<r |/o| g , r <r. (36) 

Furthermore, suppose that for each n > 1, and for all L G A™ 1 , satisfying 
$(L) < n, there exists fi as described above in ([34]) . Linear combinations of fo 
and /i may be used to derive the following lower bound on the complexity of 
solving the problem S for functions in the cone C T . 

Theorem 4. Suppose that functions fo and fi can be found that satisfy condi- 
tions (|34p and (|36p . It then follows that the complexity of the problem, defined 
by (|13|) . assuming infinite cost budget, over the cone of functions C T is 

comp(e, A(C T ,H, S, A), oo, <r) 

^ • f—if °-{t-t ) \ if o-t(t-t ) 

> mln 9 7777, r ' (99) 



2{2r~To)e) ,y ™' \2(2t - t )s 

Proof. Let A be a successful, possibly adaptive, algorithm for all functions lying 
in the cone C T . Given an error tolerance, e, and a positive a, let fo be a function 
satisfying (f36| and choose 

°o = I f I ttt — -r > 0- (37a) 
\h\g (2t - r ) 

Provide the algorithm A with the input function co/o, and let L(cofo) be the 
data vector extracted by A to obtain the estimate A(cofo). Let n = %(L) denote 
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the cost of this algorithm for the function co/o, and define two fooling functions, 
f± = cq/o ± ci/i, in terms of /i satisfying conditions (l34l) with c\ satisfying 



ci 



To)c |/o|c 



(Tt(t - T ) 



\Mg\g(n)+T] \h\ g (2T-T )[g(n)+T] 
These fooling functions must lie inside the cone C T because 



> 0. 



(37b) 



I/±Lf - t \f±\ g < c |/o|^ + d |/i|^ - r(c |/o| - d 

by the triangle inequality 
< cx[g(n) + r] - (r - r )c |/ | e by ©, dSHD 
= by (03). 



Moreover, both fooling functions have t/-semi-norms no greater than c, since 



|/±|g < c |/o| g 

(TT 



< 



It -to 

(TT 



Cl l/llg 

T -T 



2r- 



TQ 



1 



5i(n) + t 

T -To 



by (03 



Following the argument earlier in this section, it is noted that the data used 
by algorithm A for both fooling functions is the same, i.e., L(f±) = L(cofo), 
and so A(f±) = A(cofo)- Consequently, by the same argument used above, 

e > max(||S(/+) - A(f+)\\ n , ||5(/_) - A(f_)\\ n ) > Cl ||5(/i)|| w = Cl . 

Since A is successful for these two fooling functions, c\, as defined in (|37)l . must 
be no larger than the error tolerance, which implies by (|34l) that 

<Jt(t — To) Ot(t — To) | _ | , , r , , . 

— < — = fi g 2r - T )[g(n) + r] 

£ Ci * 

< (2t - To )g(n)[g(n) + r] 

< 2(2r - T )g(n) max(g(n), r). 

Since A is an arbitrary successful algorithm, this inequality provides a lower 
bound on the cost, n, that any such algorithm requires. This then implies the 
lower bound on the complexity of the problem. □ 

Comparing the lower bound on the computational complexity in Theorem 
0] to the upper bounds on the cost of the adaptive algorithms in Theorems [T] 
and [5J one can see that these adaptive algorithms are optimal for solving the 
problem for a cone of input functions if their non-adaptive building blocks are 
also optimal for solving the problem for balls of input functions. 
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Corollary 1. Suppose that the functions g,g,h, and h satisfy the hypotheses 
of Proposition^ i.e., conditions (|35[) . which means that the non-adaptive algo- 
rithms are optimal for solving the problem on J- - and Q -balls of input functions. 
It then follows that Algorithms]]] and\^ are both optimal for solving the problem 
on the cone C T . 

The next two sections illustrate the theorems of Section [3] by looking at the 
problems of integration and approximation. Each section identifies 

• the Banach spaces of input functions, T and Q, and their semi-norms, 

• the Banach space of outputs, H, and its norm, 

• the solution operator, S, 

• a set of non-adaptive algorithms, {A n } ne x, which approximate 5, are 
indexed by their cost, n, and have the property that some lower cost 
algorithms are embedded in higher cost algorithms, 

• the minimum cost multiple, r, defined just before Algorithm^ 

• the upper bound error functions, h and h, defined in (|19[) . 

• a set of non-adaptive algorithms, {G n }„ e x, which approximate \-\g, such 
that G n uses the same function data as A n , 

• the error bounds h±(n) and the deflation and inflation factors, c„ and £„, 
which are defined in (|17l) and (TT8l) respectively, and 

• the fooling functions /o and f±, along with the associated parameter to 
and the functions g and g, all of which satisfy (|34l) and (1551) . 

This allows one to use Algorithms [T] and [5] with the guarantees provided by 
Theorems [T] and [2] the lower bound on complexity provided by Theorem [4j and 
the optimality given by Corollary [TJ 

5. Approximation of One-Dimensional Integrals 

The first example we consider is univariate integration on the unit interval, 
S(f) = Jq 1 f(x)dx. The two spaces of input functions are Sobolev spaces of 
different degrees of smoothness: 

J- = W 2 - 1 and g = W 1 ' 1 , 

where the general Sobolev spaces with smoothness of degree k are defined as 
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The corresponding semi- norms are ||/||jr = ||/"||i and = ||/'||i, respec- 

tively, which correspond to the total variation of /' and /, respectively, and the 
cone of integrands is defined as 

C T = {/GW a ' 1 :||/"||i<r||/ / || 1 }. (39) 

The space of outputs H is M. 

We consider algorithms consider algorithms based on function values, and 
the cost of a function value is one. The non-adaptive building blocks to con- 
struct the adaptive algorithm are the composite trapezoidal rules based onn-1 
trapezoids: 

MI) = ^[/(zi) + 2/(x 2 ) + ■ ■ ■ + 2/(l„_i) + /(in)], Xi = 

In n — 1 

defined for n G I = {2, 3, . . .}, and with cost(A n ) = n. The algorithm A n is 
imbedded in the algorithm A 2n _i, which uses 2n — 2 trapezoids. Thus, any 
trapezoidal rule is embedded in some other trapezoidal rule with cost no more 
than r = 2 times the original one. This is the minimum cost multiple as de- 
scribed just before Algorithm [2l The algorithm for approximating ||/'||i is the 
corresponding norm of the derivative of the linear spline using the same Xi as 
used in the trapezoidal rule: 

n-l 

Gn(f) = 1/(^+0 ~ A 3 *)! ■ (40) 

i=0 

5.1. Adaptive Algorithm and Upper Bound on the Cost 

Constructing the adaptive algorithm for integration requires upper bounds 
on the errors of the A n and G n . The errors of the trapezoidal rule for integrands 
in the spaces W 2 ' 1 and W 1 ' 1 are bounded in terms of h(n) = l/[8(n — l) 2 ] and 
h(n) = l/[2(n— 1)], respectively, according to 0, (7.14) and (7.15)]. Since G n (f) 
never overestimates ||/'||i, it follows that hg_(n) — and c„ = 1. 

To find an upper bound on ||/'||i — G n (/), note that for xi < Xi+\ 

f'(x) = (n-l){[f(x x+1 )-f(xi)} + ^ ' + \n-l)v(t,x)f'(t)dt^ (41a) 



where 



v(t,x) = \ Xl *' x i^^ x > (41b) 
[t—Xi+i, x<t<x i+1 . 
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This implies the following upper bound on a piece of ||/'||i 
' X%+1 \f'(x)\dx-\f(x i+1 )-f( Xi )\ 



< (n- 1) 



x i + 1 px i + 1 



\v(t,x)\\f"(t)\dtdx 



< (n - 1) / 2(t - Si)(a: <+ i - t)|/(*)l d * 
<(n-l) max |2(* — ar i )(ar i+1 — 1)| / |/"(t)|df 

Xi<t<a;i + i y 



< 



Xi+i 



\f"(t)\dt. 



2(n- 1) 

Applying this inequality over the entire interval, [0, 1] leads to 



n-l 



!/%-<?„(/) = £ 



i=l 



< 



|/'(z)|dzH/(*m)-/(^)l 
/"Wlli 



2(n- 1) ^W T . 

v ' i=i Jj; . 



Xi+l 



ir(*)|d* 



2(n-l) 



According to (|15p and , we have h + (n) = l/[2(n— 1)] and an inflation factor of 

1 



£n = 



l-r/(2n-2) 



for ft > 1 + r/2. 



(42) 



By condition (f2"T|) , it follows that min(/i(n), rh(n)) = rh(n), which then 
simplifies some of the expressions in Algorithm [2] and Theorem [2] Specifically, 
the left side of the inequality in ([30j) becomes 



min(/i(n i ),r/i(n i ))e: ni G„ j (/) 



4(n i -l)(2n i -2-r)' 



The function A^ defined in (1211) is 



Na(o) = min {n£l: rh(n) < a} = 1 + y/a/(8r) . 
The denominator in the definition of the set of integrands Af in (|32|) is 

£ JV max /rCAT max / r min(ft( Nmax/r) , rh(N max /r)) 



2(iV m ax-2)(A^ max -2-r) 

The function A^ defined in (f3"T|) is 

Ayi(a) = min |n G X : mm(h(n), Th(n))<L n z n < a| 
/ t r 2 r 



(43a) 



(43b) 



(43c) 





/ T T 


< 1 + 






V 8a + 2 



(43d) 
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With these preliminaries, Algorithm [5] and Theorem[3]may be applied directly to 
yield the following automatic, adaptive integration algorithm and its guarantee. 



Algorithm 3. Let the error tolerance s, the maximum cost budget A max and 
the cone constant t be given inputs. Let the sequence of algorithms {^4n}n£X> 
{G„}„ e i be described above. Set i = 1. Let n\ = \(t + l)/2] + 1. For any 
input function / 6 W 2 ' 1 , do the following: 

Stage 1. Estimate ||/'||i. Compute G ni (f) in (gUJ) - 

Stage 2. Check for convergence. Check whether rii is large enough to sat- 
isfy the error tolerance, i.e. 

4e(n i -l)(2n i -2-r) 



GnA,f)< 

1 

If this is true, then set W to be false, return (A ni (f), W) and terminate 
the algorithm. If this is not true, go to Stage 3. 

Stage 3. Compute n<+i. Choose 

1 + (rii — 1) max < 2, 



H+l 



(rii - 1) 



rGn,(/) 
8e 



If rii+i < N ma _x, increment i by 1, and return to Stage 1. 

Otherwise, if n^+i > A max , choose n^+i to be the largest number not 
exceeding A max such that A ni is embedded in A ni+1 , and set W to be 
true. Return (A ni+1 (/), W) and terminate the algorithm. 

Theorem 5. Let N max , t, m, and e be given as described in Algorithm^ 
Assume that n\ < iV max . Let C T be the cone of functions defined in (|39j). Let 



Af=\feC T :\\f'\\ 1 < 



2e(A^max-2)(A max -2-r) 



be the nice subset of the cone C T . Then it follows that Algorithm^ is successful 
for all functions inAf, i.e., succ(A, W, Af, e, A, nax ) = 1. Moreover, the cost of 
this algorithm is bounded above as follows: 



cost(A,JV',e,JV nu «,||/ / || 1 ) < 2 + 2 



< 2 



rWfh 


r 2 


8e 


h 16 


r\\f% 


r 


8e 


h 2 



2G 



5.2. Lower Bound on the Computational Cost 

Next, we derive a lower bound on the cost of approximating functions in the 
cone C T by constructing fooling functions. Following the arguments of Section 
IH we choose /o(x) — x. Then 

ll/olli = 2, ||tf||i=Q, r = 0. 

For any n £ N, suppose that the one has the data Li(f) = i = 1, • . . ,n 

for arbitrary where = £o < £i < • • ■ < £n < £n+i = 1- There must be some 
j = 0, ... ,n such that — £j > 1/(71 + 1). The function fx, defined as 

30(z -&)'(£ 



otherwise, 



has /(x) dx = 1 and fx{(,i) = for i = 0, . . . , n + 1. Moreover, the norms of 
the first and second derivatives of /i are 

H/IHi = , 15 f) < g(n), 

11/1 1,1 = V3«, +1 -fc)> = W5fc +1 -&) - 5(n) 11/1,11 ' 

where the inequality — £j > 1/(ti + 1) is used to define g and g as 

15(71 + 1) 32(71+ 1) 40(77 +1) 2 

9(n) = — , *(») = -£7g-, («>)(») - ^ • 

Using these choices of fo and /i , along with the corresponding g and g above, one 
may invoke Proposition [3J Theorem [U and Corollary [1] to obtain the following 
theorem. 

Theorem 6. The adaptive trapezoidal algorithm is optimal for integration of 
functions in W ' and W 2 ' 1 . Assuming an infinite cost budget, the complexity 
of the function recovery problem problem, over the cone of functions C T defined 
in (1311 is 



i-om V U.MC r .n.S.\). v.ll/'H,) : n,h. ( J&,j2^jld!l ] - ] 



vsrii/'ii, n 



as > oo. 



V 160e e 
Moreover, Algorithm^ is optimal for approximating functions in C T 
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5.3. Numerical Example 

Consider the family of test functions denned by 

f{x) = be- a ^ x - z ^ , 



(44) 



where z ~ U[l/y/2a,l — l/>/2a], log 10 (a) ~ W[l,4], and b is chosen to make 
L f(x) dx = 1. If a is large, then / is spiky, and it is difficult to approxi- 
mate the integral. From the numerical results, we can find the success rate 
for our algorithm. Straightforward calculations yield the norms of the first two 
derivatives of this test function: 



l/'lli = 2-e 



2 2 2/i \2 

-a z —a (1 — 2) 



1/% = 2a { 2\j- - 2a \ze~ a z + (1 - z)c 



Monte Carlo simulation is used to determine the probability that / € C T for 
r = 10, 100, and 1000 (see Table HJ. 



Algorithm [3] 
T 10 


100 


1000 


quad 


quadgk 


chebf un 


Probability of / € C T 0% 
Observed Success Rate 37% 


25% 
69% 


58% 
97% 


30% 


77% 


68% 



Table 1: Performance of multistage, adaptive Algorithm [3] for various values of r, and also of 
other automatic quadrature algorithms. The test function 1 144 II with random parameters, a 
and z, was used. 



Ten thousand random instances of this test function are integrated by Algo- 
rithm[3]and three existing automatic algorithms with an absolute error tolerance 
of e = 10~ 8 . The observed success rates for these algorithms are shown. As 
expected, the success rate of Algorithm [3] increases as r is increased. All of 
the integrands lying inside C T are integrated to within the error tolerance, and a 
significant number of integrands lying outside the cone are integrated accurately 
as well. 



6. Coo Approximation of Univariate Functions 

Now we consider the problem of Coo recovery of functions defined on [0, 1], 

i.e., 

S(f) = f, U = C 00 , \\S{f)-A{f)\\ H = \\f-A{f)\\ 00 . 

The spaces of functions to be recovered are the Sobolev spaces Q = W 1,0 ° and 
T = W 2 ' 00 , where VV fe '°° was defined in (|38p. The cone of functions for which 
our adaptive algorithms will be shown to work well is defined as 

C T = {/GW ai00 :||/"IL<r||/'|| 00 }. (45) 
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Again, we consider algorithms based on function values, and the cost of a 
function value is one. The basic non-adaptive algorithm used to approximate 
functions in the cone C T is linear interpolation on evenly spaced points. Letting 
T = {2, 3, . . .} as in the previous section, the sequence of algorithms {A n } nS z, 
is defined by 

i — 1 

Xi = -, i=l,...,n, (46a) 

n — 1 



A n{f){x) = (n - 1) [f{Xi){x i+ i - X)+ f{x i+1 )(x - Xi)} , 

Xi < x < Xi + i. (46b) 

The cost of A n , is the number of function evaluations, n. Using this same data 
one may approximate the Coo norm of /' by the algorithm 

<?„(/) = (n-1) sup \f(x i+1 ) - f( Xi )\ . (47) 

i— l,...,n— 1 

Since A ni which uses n — 1 line segments, is embedded in A2 n —i, which uses 
2(n — 1) line segments, the minimum cost multiple is r = 2. 

6.1. Adaptive Algorithm and Upper Bound on the Cost 

Given the algorithms G n and A n , we now turn to deriving the worst case 
error bounds, h± defined in (|15p and h and h defined in (|19[) . For Xi < x < 
note from (1411) that the difference between f'(x) and its linear spline approxi- 
mation is bounded by 

|/'(x)|-(n-l)|/(x. i+1 )~/(^)| 

<(»-i) r +i v(t,x)f(t)dt 

J Xi 

<(n-l)\\f"\\oo J^* 1 Mt>*)\*t 
= (« - 1) liriloo { 2 ( n -l)2 " {X ~ Xl){Xl+1 " X) 

< ii/'iioo 



2(n- 1)' 



Applying the above argument for i = 1, . . . , n — 1 and noting that G n (f) never 
overestimates H/'H^, we have the following two-sided inequality: 

II f'll 

o<||/'|L-g„(/)< "•' U 



2(n-l)' 

Thus, the error bounds on G n and the inflation and deflation factors defined in 
p7|) and (|18l) may be taken to be 

M») = o, ^ + W = ^7 L T? c» = i. g " = i- T/( 1 2n -2) - (48) 
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provided that n > 1 + t/2. 

To derive the error bounds for A n (f), note that for Xi < x < Xi+\ the error 
can be bounded in terms of an integral involving /', or, using integration by 
parts, an integral involving /": 

\f(x) - A n (f)(x)\ 

= \ f( x ) -(n-l) [f(xi)(x i+1 -x) + f(x i+1 )(x - Xi)]\ 



(n-l) 
(n - 1) 



x) / f'(t)dt-{x-Xi) 



(Xi+l 

(Xi+I-X) / f"(t)(t-Xi)dt 

J Xi 
Xi + l 



f'(t)dt 



+ (x - Xi) 



f"(t)(x i+1 -t)dt 



The expression involving /' yields the bound 

\f(x) - A n (f)(x)\ < 2(n - l)(x - Xi)(x i+1 
II/'IL 



x)\\f'\\ 



< 



2(n-l)' 

while the expression involving /" yields the bound 



\f(x)-A n (f)(x)\<(n-l) 

_ (x - Xj)(x l+ i - x) 

2 

This implies that we may take 
1 



. (x X{) (^Cj+i 

(Xj+i - X) h [X - Xi) 



a;) 2 



l/"ll 



iiriL < 



(n-l) 5 



h(n) = 



2(n-l) ! 



h(n) 



< 



1 



(n-l) 2 - 4(n- l) 2 



h{n)h+(ri). (49) 



Since h±(n),h(n), and /i(n), are the same as in the previous section for 
integration, the simplifications in [43] apply here as well. Then Algorithm [2] and 
Theorem [5] may be applied directly to yield the following algorithm for function 
approximation and its guarantee. 

Algorithm 4. Let the error tolerance e, the maximum cost budget iV max , and 
the cone constant r be given inputs. Let the sequences of algorithms, {A n } nG x 
and {G n }n£i be as described above. Set i = 1. Let n\ = \(t + l)/2] + 1. For 
any input function / e J 7 , do the following: 



Stage 1. Estimate H/'H^. Compute G ni (f) in (|47| . 

Stage 2. Check for Convergence. Check whether is large enough to sat- 
isfy the error tolerance, i.e., 



G ni (f)< 



4 E (n i -l)(2n i -2-r) 
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If this is true, then set W to be false, return (A ni (f), W) and terminate 
the algorithm. If this is not true, go to Stage 3. 

Stage 3. Compute rij+i. Choose 

rij+i = 1 + (rii — 1) max < 2, 



(rii - 1) 



8.5 



If rii + i < N max , increment z by 1, and return to Stage 1. 

Otherwise, if n^+i > N max , choose rit+i to be the largest number not 
exceeding iV max such that A ni is embedded in A ni+1 , and set W to be 
true. Return (A ni+1 (f), W) and terminate the algorithm. 

Theorem 7. Let e, N max , r and n\ be given as described in Algorithm [7} 
Assume that n\ < iV max . Let C T be the cone of functions defined in (|45[) . Let 

2e(^V m ax-2)(iV max -2-7 



Af - 



fee £/ie nice subset of the coneC T . Then it follows that Algorithm^ is successful 
for all functions in Af , i.e., succ(A, W, Af, e, AT max ) = 1. Moreover, the cost of 
this algorithm is bounded above as follows: 



coBt(i4,^,e,JV ni ax,||/ / || 00 )<2 + 2 





T 2 


8e 


h 16 


^ll/'lloo 


T 


8e 


^2 



< 2 



6.2. Lower Bound on the Computational Cost 

Next, we derive a lower bound on the cost of approximating functions in the 
cone C T by constructing fooling functions. Following the arguments of Section 
SI we choose fo(x) = x. Then 

ll/olloc = l. ll/olloc=0, T = 0. 

For any n £ N, suppose that the one has the data = f(d), i = 1, ■ ■ • , n 

for arbitrary £j, where = £o < £i < • • • < £n < £n+l = 1- There must be some 
j = 0, . . . , n such that — £j > l/(n + 1). The function /i, defined as 

16(x-0) 2 fe+i-^) 2 



^) 4 



o 



otherwise, 



has H/ill^ = 1 and Afe) = for i = 
first and second derivatives of f\ are 

_ 16 

I OO 



ll/illoo = 

n/r 
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, . . . , n + 1. Moreover, the norms of the 

< g(n), 

<9(n)\\f[\L, 



6V3H/1I 
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where the inequality — £j > l/(n + 1) is used to define 5 and 17 as 
g(n)= 3^ . fl(n)=6>/3(n + l) 1 = 32(n + l) 2 . 

Using these choices of /o and /1 , along with the corresponding g and g above, one 
may invoke Proposition [3J Theorem SI and Corollary [1] to obtain the following 
theorem. 

Theorem 8. The linear spline algorithm is optimal for Coo approximation of 
functions in W l oa and W 2 '°°. Assuming an infinite cost budget, the complexity 
of the function recovery problem problem, over the cone of functions C T defined 
in (gni) is 

com V (e,A(C T ,n,S,A),oo, \\f\U > min ( ^fK \f^Pj 1 

V 128e £ 
Moreover, Algorithm^ is optimal for approximating functions inC T - 

6.3. Numerical Example 

To illustrate Algorithm |4] we choose the same family of test functions as 
in (J44J), but now with z ~ W[0,1], log 10 (a) ~ U[Q,A], and 6=1. When a is 
large this function is very spiky and hard to approximate since the sampled 
function values may miss the spike. Since H/'H^ = a-^/2/e and ||/"|loc = 
2a 2 , the probability that / e C T is min (max(0,log 10 (r/\/2e)/4) , l) . For r = 
10, 100, 1000 we choose a sample of 10000 functions and an error tolerance of 
e = 10 -8 . The empirical probability that Algorithm 0] succeeds in returning an 
answer within the error tolerance is given in Table [2] Our algorithm succeeds 
for all functions lying in the cone plus some others. It is conservative, but not 
overly so. 

t 10 100 1000 

Probability of / e C T 16% 41% 66% 
Observed Success Rate 25% 49% 74% 

Table 2: Comparison between the probability of the input function lying in the cone and the 
empirical success rate of Algorithm [4] 



7. Addressing Questions and Concerns About Automatic Algorithms 

Automatic algorithms are popular, especially for univariate integration prob- 
lems. As mentioned in the introduction, MATLAB @, Ell and the NAG [II] 
library all have one or more automatic integration routines. In spite of this pop- 
ularity certain questions, concerns, or even objections have been raised about 
automatic algorithms. This section attempts to address them. 
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7.1. Automatic Algorithms Can Be Fooled 

In claiming to construct guaranteed automatic algorithms, it must be under- 
stood that the guarantees still require certain assumptions on the input func- 
tions. Any algorithm that solves a problem for involving an infinite-dimensional 
space of input functions can be fooled by a spiky function, i.e., one that yields 
zero data where probed by the algorithm, but is nonzero elsewhere. A guar- 
anteed automatic algorithm specifies sufficient conditions for success, and func- 
tions that are too spiky violate these conditions. Thus, the aim is not to have an 
algorithm that works for all functions, but to know for which functions it is guar- 
anteed to work, and preferably to be able to adjust the algorithm to broaden or 
narrow that set depending on the relevant application and computational cost 
budget. 

7.2. Why Cones? 

Traditional error estimates are derived for balls of input functions, B a , as 
defined in (| lb|) with radius a. The analysis here uses cones, C T , of the form ([2]), 
instead. Automatic algorithms proceed by bounding, perhaps conservatively, 
the approximation error and then increasing the number of function data until 
that error bound is no larger than the prescribed tolerance. These error bounds 
are constructed in such a way that if they are reliable for /, then they are also 
reliable for cf, where c is any real number. Thus, if an algorithm is successful 
for the input function /, then it is also successful for any input of the form 
cf. By definition a cone is just that set that contains cf for all numbers c if it 
contains /. 

One might wonder whether the definition of C T in ([2j is too strict. Ignoring 
the cost budget, an alternative would be to define the cone of success, C succcss , 
consisting of all input functions for which the automatic algorithm successfully 
approximates the solution within the error tolerance. Clearly C succcss contains 
C T , and likely C succoss also contains many functions not in C T . However, the defi- 
nition of C succoss does not provide any insight into what kinds of input functions 
the automatic algorithm can successfully handle. Moreover, the optimality re- 
sults that one can often prove, including those in Sections [3] and [BJ show that 
the cost of the automatic algorithm is within a constant multiple of the cost of 
the best algorithm for C T . 

The automatic algorithms described here require that the width of the cone, 
r, be specified. This requires the user's judgement, and its value cannot be 
determined from the function data. The choice of r reflects how cautious the 
user wants to be, since a larger r leads to an adaptive algorithm with a higher 
cost. 

The condition defining a ball B a involves only one semi-norm, whereas the 
condition defining the cone C T involves two semi-norms. One may wonder 
whether this makes specifying r more difficult than choosing a. As mentioned 
in the introduction, automatic algorithms based on balls are not adaptive and 
have fixed cost. There is no savings in computational cost when the actual 
semi-norm of the input function is much smaller than the radius of the ball. 
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For automatic algorithms designed for balls of input functions, as diagramed in 
Figure [2j a is a measure of the largest input function that the algorithm is de- 
signed to handle. This size is not necessarily something that is easy to bound a 
priori. On the other hand, for automatic algorithms designed for cones of input 
functions, as diagramed in Figure|31 the parameter r is a measure of the nastiest 
input function that the algorithm is designed to handle. For the examples of 
univariate integration in Section [5] and univariate function recovery in Section 
t may be interpreted as a minimum sample size, and 1 /r may be interpreted 
the length scale of the spikiest function that the algorithm can handle. These 
interpretations facilitate the choice of r. 

7.3. Automatic Algorithms that Stop When A ni (f) — A rii l (f) Is Small 

Many automatic algorithms, especially those for univariate integration, are 
based on a sequence of non-adaptive algorithms {A ni } c *L 1 , and a stopping rule 
that returns A ni (f) as the answer for the first i where A ni (f) — A ni _ 1 (f) is small 
enough. Fundamental texts in numerical algorithms advocate such stopping 
rules, e.g. @, p. 223-224], @|, p. 233], and [!l p. 270]. Unfortunately, such 
stopping rules are problematic. 

For instance, consider the univariate integration problem and the trapezoidal 
rule algorithm, A rH , based on rii = 2* + 1 points, i.e., rii — 1 = 2 l trapezoids. It 
is taught that the trapezoidal rule has the following error estimate: 

%{f) ■= A "' (/) ' 3 A "'- l(/) « f Q f{ X ) dx - A ni (/) =: Ek{f). 

Since A ni (f) +!?,(/) is exactly Simpson's rule, the quality of Ei(f) as an error 
estimate is equivalent to the quality of Simpson's rule. Since Ei(f) — E^f) = 
0(16~* 1| / Id); the error estimate will often be good even for moderate i, but 
it can only be guaranteed with some a priori knowledge of ||/^ 4 ''|| 1 - 

In his provocatively titled SIAM Review article, When Not to Use an Auto- 
matic Quadrature Routine [9J, p. 69], James Lyness makes the following claim. 

While prepared to take the risk of being misled by chance alignment 
of zeros in the integrand function, or by narrow peaks which are 
"missed," the user may wish to be reassured that for "reasonable" 
integrand functions which do not have these characteristics all will 
be well. It is the purpose of the rest of this section to demonstrate 
by example that he cannot be reassured on this point. In fact the 
routine is likely to be unreliable in a significant proportion of the 
problems it faces (say 1 to 5%) and there is no way of predicting in 
a straightforward way in which of any set of apparently reasonable 
problems this will happen. 

The following is a summary of Lyness's argument using the notation of the 
present article. To fool an automatic algorithm one constructs an integrand f\, 
which is parameterized by A, such that 
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°' 2 0.2 0^4 06 08 1 

Figure 4: Integrand designed to fool MATLAB's quad along with the data used by quad. 

• fx is "reasonable" for all A € [0, 1], 

• A n (f\) is continuous in A, and 

• for some moderate n,, A ni (fx) — A nil (fx) has different signs for A = 0, 1. 

It then follows that A ni (fx,) — A nil (fx t ) = for some A* € [0, 1]. Then this 
algorithm will likely fail for integrands fx with A nearby A* since the stopping 
criterion is satisfied, but m is not large enough to satisfy the desired error 
tolerance. 

Lyness's argument, with its pessimistic conclusion, is correct for algorithms 
that use the size of A ru (f) — A ni _ 1 (f) as a stopping criterion. Figure 2] depicts 
an integrand constructed in a manner similarly to that described by Lyness. 
We would call this integrand "fluky". MATLAB's quad, which is based on 
adaptive Simpson's rule gives the answer w 0.1733, for an absolute error 
tolerance of e = 1CP 14 , but the true answer is ps 0.1925. The quad routine splits 
the interval of integration into three separate intervals and initially calculates 
Simpson's rule with one and two parabolas for each of the three intervals. The 
data taken are denoted by • in Figure |H Since this fluky integrand is designed 
so that the two Simpson's rules match exactly for each of the three intervals, 
quad is fooled into thinking that it knows the correct value of the integral and 
terminates immediately. 

While Lyness's argument is correct, it does not apply to the algorithms pro- 
posed in this article because their stopping criteria are not based on A ni (f) — 
A ni _ 1 (f). Algorithm [3] presented in Section^ always succeeds for "reasonable" 
integrands because such integrands lie in the cone C T . The analogous statement 
is true for function approximation Algorithm |4l as well as for the general Algo- 
rithms Q] and [2] All of these algorithms succeed for reasonable input functions. 
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Lyness's warning in [9( should not be interpreted as an objection to auto- 
matic algorithms. It is a valid objection to stopping criteria that are based on 
the value of a single functional, in this case, A ni — A nil . What Lyness clearly 
demonstrates is that one may easily make a functional vanish even though the 
error of the algorithm is significant. 



7.4- No Advantage in Adaption 

Although there are some positive results on adaption in 12|, [l3j, |l5[ , there 
are also sweeping, rigorous results from information based complexity theory 
stating that adaptive algorithms have no significant advantage over non-adaptive 
algorithms (e.g., see [19j, Chapter 4, Theorem 5.2.1] and [12|). The automatic 
algorithms presented here do adaptively determine the total number of samples 
required based on the function data collected. The reason that adaption can 
help in this context is that the cone of input functions, C T , is not a convex set. 
This violates one of the assumptions required to prove the negative results that 
adaption does not help. 

To see why C T is not convex, let f- m and / out be functions in T with nonzero 
<J-semi-norms, where /; n lies in the interior of this cone, and / out lies outside 
the cone. This means that 

l/ in l-F _ T < T < _ = l/°"t|jr ^ 

I /in | g | /out | g 

Next define two functions in terms of f m and / ou t as follows: 

f± = (t - Tin) \ fin\g /out ± (r + T out ) |/out|g fin, 

These functions must lie inside C T because 



\f±\z 
l/±ls 



(t - Tin) \fin\g font ± (V + T out ) |/out|g fu 



T 



< 



(t - Tm) \fin\g /out ± (t + T out ) |/out|g fu 

(T - T in ) |/in| g |/out|jr + (r + T out ) |/out| g |/in|jr 
-(T - T in ) \fi a \g | /out | g + (t + T out ) |/ ou t|g |/in| g 
(T - T in )r out + (T + T out )T in _ r(T out + Tin) _ 
-(t- Tin) + (T + Tout) T out + T in 

On the other hand, the average of f±, which is also a convex combination is 

^/- + = ( T _ Tin ) l/inlg /out- 

Since r > r m , this is a nonzero multiple of /out) and it lies outside C r . Thus, 
this cone is not a convex set. 
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8. Discussion and Further Work 

When we consider the landscape of numerical algorithms, it is clear that 
relatively simple problems have well-developed automatic algorithms that we use 
without questioning their reliability. For example, the MATLAB 17] algorithms 



cos, sin, exp, log, airy, besselj, gamma, gammainc, ellipj , erf, and erf inv 

all automatically use the correct number of terms in a suitable expansion needed 
to provide the corresponding function value with 15 significant digit accuracy. 
The only exceptions are cases of unavoidable round-off error, e.g., 



whereas the correct answer is 1. 

We believe that more complex problems, whose inputs are functions, also 
deserve to have automatic algorithms with guarantees of their success. Users 
ought to be able to integrate functions, approximate functions, optimize func- 
tions, etc., without needing to manually tune the sample size. Here we have 
shown how this might be done in general, as well as specifically for two case 
studies. We hope that this will inspire further research in this direction. 

The results presented here suggest a number of other interesting open prob- 
lems. Here is a summary. 

• Here we consider an absolute error tolerance. This analysis should be 
extended to relative error, which is work in progress. 

• The univariate integration and function approximation algorithms in Sec- 
tions [5] and M have low order convergence. Guaranteed automatic algo- 
rithms with higher order convergence rates for smoother input functions 
are needed. These might be based on higher degree piecewise polynomial 
approximations. 

• There are other types of problems, e.g., differential equations and nonlin- 
ear optimization, which fit the general framework presented here. These 
problems also have automatic algorithms, but without guarantees. It 
would be helpful to develop guaranteed automatic algorithms in these 
other areas. 

• The algorithms developed here are globally adaptive, in the sense that 
the function data determines the sample size, but not the kinds of data 
collected or the locations of the sample points, which depend only on the 
sample size. Some existing automatic algorithms are locally adaptive in 
that they collect more data in regions of special interest, say where the 
function has a spike. Such algorithms need guarantees like the ones that 
are provided here for globally adaptive algorithms. The appropriate kinds 
of spaces Q and T ', and their semi-norms, need to be identified for locally 
adaptive algorithms. 



cos(le47 * pi) = -5.432862626006405e - 01, 
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• For some numerical problems the error bound of the non-adaptive algo- 
rithm involves a Q- or ^-semi-norm that is very hard to approximate 
because of its complexity. An example is multivariate quadrature using 
quasi-Monte Carlo algorithms, where the error depends on the variation 
of the integrand. The definition of the variation depends on the defini- 
tion of Q or J 7 , but it is essentially some semi-norm of a mixed partial 
derivative of the integrand. To obtain guaranteed automatic algorithms 
one must either find an efficient way to approximate the variation of the 
function or find other suitable conservative estimates for the error that 
can be reliably obtained from the function data. 

• This article considers only the worst case error of deterministic algorithms. 
There are many random algorithms, and they must be analyzed by some- 
what different methods. A guaranteed Monte Carlo algorithm for estimat- 
ing the mean of a random variable, which includes multivariate integration 
as a special case, has been proposed in [f|. 
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